Goto

Collaborating Authors

 dl algorithm





Unified Privacy Guarantees for Decentralized Learning via Matrix Factorization

arXiv.org Artificial Intelligence

Decentralized Learning (DL) enables users to collaboratively train models without sharing raw data by iteratively averaging local updates with neighbors in a network graph. This setting is increasingly popular for its scalability and its ability to keep data local under user control. Strong privacy guarantees in DL are typically achieved through Differential Privacy (DP), with results showing that DL can even amplify privacy by disseminating noise across peer-to-peer communications. Yet in practice, the observed privacy-utility trade-off often appears worse than in centralized training, which may be due to limitations in current DP accounting methods for DL. In this paper, we show that recent advances in centralized DP accounting based on Matrix Factorization (MF) for analyzing temporal noise correlations can also be leveraged in DL. By generalizing existing MF results, we show how to cast both standard DL algorithms and common trust models into a unified formulation. This yields tighter privacy accounting for existing DP-DL algorithms and provides a principled way to develop new ones. To demonstrate the approach, we introduce MAFALDA-SGD, a gossip-based DL algorithm with user-level correlated noise that outperforms existing methods on synthetic and real-world graphs.



Classification of 24-hour movement behaviors from wrist-worn accelerometer data: from handcrafted features to deep learning techniques

arXiv.org Artificial Intelligence

Purpose: We compared the performance of deep learning (DL) and classical machine learning (ML) algorithms for the classification of 24-hour movement behavior into sleep, sedentary, light intensity physical activity (LPA), and moderate-to-vigorous intensity physical activity (MVPA). Methods: Open-access data from 151 adults wearing a wrist-worn accelerometer (Axivity-AX3) was used. Participants were randomly divided into training, validation, and test sets (121, 15, and 15 participants each). Raw acceleration signals were segmented into non-overlapping 10-second windows, and then a total of 104 handcrafted features were extracted. Four DL algorithms-Long Short-Term Memory (LSTM), Bidirectional Long Short-Term Memory (BiLSTM), Gated Recurrent Units (GRU), and One-Dimensional Convolutional Neural Network (1D-CNN)-were trained using raw acceleration signals and with handcrafted features extracted from these signals to predict 24-hour movement behavior categories. The handcrafted features were also used to train classical ML algorithms, namely Random Forest (RF), Support Vector Machine (SVM), Extreme Gradient Boosting (XGBoost), Logistic Regression (LR), Artificial Neural Network (ANN), and Decision Tree (DT) for classifying 24-hour movement behavior intensities. Results: LSTM, BiLSTM, and GRU showed an overall accuracy of approximately 85% when trained with raw acceleration signals, and 1D-CNN an overall accuracy of approximately 80%. When trained on handcrafted features, the overall accuracy for both DL and classical ML algorithms ranged from 70% to 81%. Overall, there was a higher confusion in classification of MVPA and LPA, compared to sleep and sedentary categories. Conclusion: DL methods with raw acceleration signals had only slightly better performance in predicting 24-hour movement behavior intensities, compared to when DL and classical ML were trained with handcrafted features.


Kernel Recursive Least Squares Dictionary Learning Algorithm

arXiv.org Artificial Intelligence

Data factorization methods have met with considerable success in discovering latent features of the signals encountered in wide-ranging applications. In this way, the representation bases, which make up the columns of the basis matrix or dictionary, are learned from the available samples of the target environment. An example is the sparse representation (SR) in which the dictionary is intended to best represent the data with a small number of atoms, much smaller than the dimension of the signal space. It has been shown that, in addition to a more informative representation of signals, imposing sparsity constraints on the representation coefficients can improve the generalization performance and the computational efficiency [1, 2, 3]. Furthermore, the sparse representation is more robust to noise, redundancy, and missing data. These features are mainly attributed to the fact that the intrinsic dimension of natural signals is usually much smaller than their apparent dimension and hence SR in an appropriate dictionary can extract these intrinsic features more efficiently. SR has been a successful strategy and has received considerable attention and achieved state-of-the-art results in many applications, e.g.


ECOSoundSet: a finely annotated dataset for the automated acoustic identification of Orthoptera and Cicadidae in North, Central and temperate Western Europe

arXiv.org Artificial Intelligence

Currently available tools for the automated acoustic recognition of European insects in natural soundscapes are limited in scope. Large and ecologically heterogeneous acoustic datasets are currently needed for these algorithms to cross-contextually recognize the subtle and complex acoustic signatures produced by each species, thus making the availability of such datasets a key requisite for their development. Here we present ECOSoundSet (European Cicadidae and Orthoptera Sound dataSet), a dataset containing 10,653 recordings of 200 orthopteran and 24 cicada species (217 and 26 respective taxa when including subspecies) present in North, Central, and temperate Western Europe (Andorra, Belgium, Denmark, mainland France and Corsica, Germany, Ireland, Luxembourg, Monaco, Netherlands, United Kingdom, Switzerland), collected partly through targeted fieldwork in South France and Catalonia and partly through contributions from various European entomologists. The dataset is composed of a combination of coarsely labeled recordings, for which we can only infer the presence, at some point, of their target species (weak labeling), and finely annotated recordings, for which we know the specific time and frequency range of each insect sound present in the recording (strong labeling). We also provide a train/validation/test split of the strongly labeled recordings, with respective approximate proportions of 0.8, 0.1 and 0.1, in order to facilitate their incorporation in the training and evaluation of deep learning algorithms. This dataset could serve as a meaningful complement to recordings already available online for the training of deep learning algorithms for the acoustic classification of orthopterans and cicadas in North, Central, and temperate Western Europe.


Demo: Multi-Modal Seizure Prediction System

arXiv.org Artificial Intelligence

This demo presents SeizNet, an innovative system for predicting epileptic seizures benefiting from a multi-modal sensor network and utilizing Deep Learning (DL) techniques. Epilepsy affects approximately 65 million people worldwide, many of whom experience drug-resistant seizures. SeizNet aims at providing highly accurate alerts, allowing individuals to take preventive measures without being disturbed by false alarms. SeizNet uses a combination of data collected through either invasive (intracranial electroencephalogram (iEEG)) or non-invasive (electroencephalogram (EEG) and electrocardiogram (ECG)) sensors, and processed by advanced DL algorithms that are optimized for real-time inference at the edge, ensuring privacy and minimizing data transmission. SeizNet achieves > 97% accuracy in seizure prediction while keeping the size and energy restrictions of an implantable device.


Atom dimension adaptation for infinite set dictionary learning

arXiv.org Artificial Intelligence

Recent work on dictionary learning with set-atoms has shown benefits in anomaly detection. Instead of viewing an atom as a single vector, these methods allow building sparse representations with atoms taken from a set around a central vector; the set can be a cone or may have a probability distribution associated to it. We propose a method for adaptively adjusting the size of set-atoms in Gaussian and cone dictionary learning. The purpose of the algorithm is to match the atom sizes with their contribution in representing the signals. The proposed algorithm not only decreases the representation error, but also improves anomaly detection, for a class of anomalies called `dependency'. We obtain better detection performance than state-of-the-art methods.